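The original "Seven Motifs" set forth a roadmap of essential methods for the field of scientific computing, where a motif is an algorithmic method that captures a pattern of computation and data movement. We present the "Nine Motifs of Simulation Intelligence", a roadmap for the development and integration of the essential algorithms necessary for a merger of scientific computing, scientific simulation, and artificial intelligence. We call this merger simulation intelligence (SI), for short. We argue that the motifs of simulation intelligence are interconnected and interdependent, much like the components within the layers of an operating system. Using this metaphor, we explore the nature of each layer of the simulation intelligence operating system stack (SI-stack) and the motifs therein: (1) multi-physics and multi-scale modeling; (2) surrogate modeling and emulation; (3) simulation-based inference; (4) causal modeling and inference; (5) agent-based modeling; (6) probabilistic programming; (7) differentiable programming; (8) open-ended optimization; (9) machine programming. We believe coordinated efforts between motifs offer immense opportunity to accelerate scientific discovery, from solving inverse problems in synthetic biology and climate science, to directing nuclear energy experiments and predicting emergent behavior in socioeconomic settings. We elaborate on each layer of the SI-stack, detailing the state-of-the-art methods, presenting examples to highlight challenges and opportunities, and advocating for specific ways to advance the motifs and the synergies from their combinations. Advancing and integrating these technologies can enable a robust and efficient hypothesis-simulation-analysis type of scientific method, which we introduce with several use cases for human-machine teaming and automated science.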
translated by Google Translate
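We outline emerging opportunities and challenges to enhance the utility of AI for scientific discovery. The distinct goals of AI for industry versus the goals of AI for science create tension between identifying patterns in data and discovering patterns in the world from data. If we address the fundamental challenges associated with "bridging the gap" between domain-driven scientific models and data-driven AI learning machines, then we expect that these AI models can transform hypothesis generation, scientific discovery, and the scientific process itself.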
We study non-parametric estimation of the value function of an infinite-horizon $\gamma$-discounted Markov reward process (MRP) using observations from a single trajectory. We provide non-asymptotic guarantees for a general family of kernel-based multi-step temporal difference (TD) estimates, including canonical $K$-step look-ahead TD for $K = 1, 2, \ldots$ and the TD$(\lambda)$ family for $\lambda \in [0,1)$ as special cases. Our bounds capture its dependence on Bellman fluctuations, mixing time of the Markov chain, any mis-specification in the model, as well as the choice of weight function defining the estimator itself, and reveal some delicate interactions between mixing time and model mis-specification. For a given TD method applied to a well-specified model, its statistical error under trajectory data is similar to that of i.i.d. sample transition pairs, whereas under mis-specification, temporal dependence in data inflates the statistical error. However, any such deterioration can be mitigated by increased look-ahead. We complement our upper bounds by proving minimax lower bounds that establish optimality of TD-based methods with appropriately chosen look-ahead and weighting, and reveal some fundamental differences between value function estimation and ordinary non-parametric regression.
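As a rough illustration of the $K$-step look-ahead TD idea mentioned in this abstract, the sketch below runs a tabular version along a single trajectory. It is a simplification under stated assumptions, not the kernel-based estimator the paper analyzes: the table-valued estimate, the constant step size, the function name `k_step_td`, and the deterministic three-state cycle are all hypothetical choices made so that the result can be checked against a closed form.

```python
def k_step_td(states, rewards, gamma, K, num_states, step_size=0.5):
    """Tabular K-step TD along one trajectory (illustrative simplification).

    states[t] and rewards[t] are the state visited and reward received at time t.
    """
    V = [0.0] * num_states
    for t in range(len(states) - K):
        # K-step target: K discounted rewards plus the bootstrapped tail value.
        target = sum(gamma**k * rewards[t + k] for k in range(K))
        target += gamma**K * V[states[t + K]]
        V[states[t]] += step_size * (target - V[states[t]])
    return V


# Hypothetical deterministic reward process: the cycle 0 -> 1 -> 2 -> 0.
gamma = 0.9
reward_of = [1.0, 0.0, 2.0]
states = [t % 3 for t in range(3003)]
rewards = [reward_of[s] for s in states]

V1 = k_step_td(states, rewards, gamma, K=1, num_states=3)
V3 = k_step_td(states, rewards, gamma, K=3, num_states=3)

# Closed-form value of state 0 on this cycle, used only as a sanity check.
v0_exact = (reward_of[0] + gamma * reward_of[1] + gamma**2 * reward_of[2]) / (1 - gamma**3)
```

On this noise-free chain both `K=1` and `K=3` recover the same value function; the statistical trade-offs between look-ahead, mixing time, and mis-specification described in the abstract only appear with stochastic transitions and function approximation.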
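The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures. We analyze a broad class of two-stage procedures that first estimate the treatment effect function, and then use this quantity to estimate the linear functional. We prove non-asymptotic upper bounds on the mean-squared error of such procedures: these bounds reveal that in order to obtain non-asymptotically optimal procedures, the error in estimating the treatment effect should be minimized in a certain weighted $L^2$-norm. We analyze a two-stage procedure based on constrained regression in this weighted norm, and establish its instance-dependent optimality in finite samples via matching non-asymptotic local minimax lower bounds. These results show that the optimal non-asymptotic risk, in addition to depending on the asymptotically efficient variance, depends on the weighted norm distance between the true outcome function and its approximation by the richest function class supported by the sample size.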
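We propose and analyze a reinforcement learning principle that approximates the Bellman equations by enforcing their validity only along a user-defined space of test functions. Focusing on applications to model-free offline RL with function approximation, we exploit this principle to derive confidence intervals for off-policy evaluation, as well as to optimize over policies within a prescribed policy class. We prove an oracle inequality on our policy optimization procedure in terms of a trade-off between the value and uncertainty of an arbitrary comparator policy. Different choices of test function spaces allow us to tackle different problems within a common framework. We characterize the loss of efficiency in moving from on-policy to off-policy data using our procedures, and establish connections to concentrability coefficients studied in past work. We examine in depth the implementation of our methods with linear function approximation, and provide theoretical guarantees with polynomial-time implementations even when Bellman closure does not hold.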
We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space. Focusing on a stochastic query model that provides noisy evaluations of the operator, we analyze a variance-reduced stochastic approximation scheme, and establish non-asymptotic bounds for both the operator defect and the estimation error, measured in an arbitrary semi-norm. In contrast to worst-case guarantees, our bounds are instance-dependent, and achieve the local asymptotic minimax risk non-asymptotically. For linear operators, contractivity can be relaxed to multi-step contractivity, so that the theory can be applied to problems like average reward policy evaluation problem in reinforcement learning. We illustrate the theory via applications to stochastic shortest path problems, two-player zero-sum Markov games, as well as policy evaluation and $Q$-learning for tabular Markov decision processes.
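We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order $t_{\mathrm{mix}} \tfrac{d}{n}$, where $t_{\mathrm{mix}}$ is a mixing time. We then prove a non-asymptotic instance-dependent bound on a suitably averaged sequence of iterates, with a leading term that matches the local asymptotic minimax limit, including sharp dependence on the parameters $(d, t_{\mathrm{mix}})$ in the higher order terms. We complement these upper bounds with a non-asymptotic minimax lower bound that establishes the instance-optimality of the averaged SA estimator. We derive corollaries of these results for policy evaluation with Markov noise (covering the TD($\lambda$) family of algorithms for all $\lambda \in [0, 1)$) and for linear autoregressive models. Our instance-dependent characterizations open the door to the design of fine-grained model selection procedures for hyperparameter tuning (e.g., choosing the value of $\lambda$ when running the TD($\lambda$) algorithm).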
The goal of a decision-based adversarial attack on a trained model is to generate adversarial examples based solely on observing output labels returned by the targeted model. We develop HopSkipJumpAttack, a family of algorithms based on a novel estimate of the gradient direction using binary information at the decision boundary. The proposed family includes both untargeted and targeted attacks optimized for $\ell_2$ and $\ell_\infty$ similarity metrics respectively. Theoretical analysis is provided for the proposed algorithms and the gradient direction estimate. Experiments show HopSkipJumpAttack requires significantly fewer model queries than several state-of-the-art decision-based adversarial attacks. It also achieves competitive performance in attacking several widely-used defense mechanisms.
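To make "an estimate of the gradient direction using binary information at the decision boundary" concrete, here is a minimal sketch of one way such an estimate can be formed: average random unit directions, each weighted by the $\pm 1$ decision observed at a small perturbation of a boundary point. This is a simplified Monte Carlo version and should not be read as the full HopSkipJumpAttack procedure; the function name `estimate_direction`, the linear classifier, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def estimate_direction(decision, x, delta, num_queries, rng):
    """Estimate the boundary normal at x using only +1/-1 decisions."""
    dim = len(x)
    total = [0.0] * dim
    for _ in range(num_queries):
        # Draw a direction uniformly on the unit sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in u))
        u = [c / norm for c in u]
        # One model query: the binary decision at the perturbed point.
        sign = decision([xi + delta * ui for xi, ui in zip(x, u)])
        total = [ti + sign * ui for ti, ui in zip(total, u)]
    length = math.sqrt(sum(c * c for c in total))
    return [c / length for c in total]


# Hypothetical target model: a linear classifier with weight vector w.
w = [3.0, 4.0]


def decision(point):
    return 1.0 if sum(wi * pi for wi, pi in zip(w, point)) > 0 else -1.0


x_boundary = [4.0, -3.0]  # satisfies w . x = 0, so it lies on the boundary
direction = estimate_direction(decision, x_boundary, delta=0.1,
                               num_queries=2000, rng=random.Random(0))
true_normal = [0.6, 0.8]  # w / ||w||
cosine = sum(d * n for d, n in zip(direction, true_normal))
```

For a linear boundary the estimate aligns closely with the true normal, so `cosine` should be near 1. Each loop iteration costs one model query, which is the resource that decision-based attacks aim to economize.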
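We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d. samples are known to have lower accuracy than the classical $n^{-\frac{1}{2}}$ error. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We provide a rigorous characterization of EM for fitting a weakly identifiable Gaussian mixture in a univariate setting, where we prove that the EM algorithm converges in order $n^{\frac{3}{4}}$ steps and returns estimates that are at a Euclidean distance of order $n^{-\frac{1}{8}}$ and $n^{-\frac{1}{4}}$ from the true location and scale parameter respectively. Establishing the slow rates in the univariate setting requires a novel localization argument with two stages, with each stage involving an epoch-based argument applied to a different surrogate EM operator at the population level. We demonstrate several multivariate ($d \geq 2$) examples that exhibit the same slow rates as the univariate case. We also prove slow statistical rates in higher dimensions in a special case, when the fitted covariance is constrained to be a multiple of the identity.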
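The following sketch shows what an EM iteration can look like in a univariate location-scale setting of this kind. It assumes, as an illustration not spelled out in the abstract, that the fitted model is the symmetric mixture $0.5\,N(-\theta, \sigma^2) + 0.5\,N(\theta, \sigma^2)$ and that the data come from a single standard Gaussian; the function names and the starting point are likewise hypothetical. It only checks structural properties of EM and does not attempt to reproduce the stated rates.

```python
import math
import random


def em_step(xs, theta, sigma2):
    """One EM update for the fit 0.5*N(-theta, sigma2) + 0.5*N(theta, sigma2)."""
    n = len(xs)
    # E-step folded in: if w_i is the posterior probability that x_i came from
    # the +theta component, then 2*w_i - 1 = tanh(theta * x_i / sigma2).
    new_theta = sum(math.tanh(theta * x / sigma2) * x for x in xs) / n
    # M-step for the shared variance, using the updated location.
    new_sigma2 = sum(x * x for x in xs) / n - new_theta**2
    return new_theta, new_sigma2


def log_likelihood(xs, theta, sigma2):
    def density(x, mean):
        scale = math.sqrt(2 * math.pi * sigma2)
        return math.exp(-((x - mean) ** 2) / (2 * sigma2)) / scale

    return sum(math.log(0.5 * density(x, -theta) + 0.5 * density(x, theta))
               for x in xs)


rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(500)]  # data from a single N(0, 1)
mean_sq = sum(x * x for x in xs) / len(xs)

theta, sigma2 = 1.0, 1.0  # arbitrary starting point
history = [log_likelihood(xs, theta, sigma2)]
for _ in range(200):
    theta, sigma2 = em_step(xs, theta, sigma2)
    history.append(log_likelihood(xs, theta, sigma2))
```

Two properties hold by construction: the log-likelihood in `history` never decreases, and after every update `theta**2 + sigma2` equals the empirical second moment `mean_sq`. The second property couples the location and scale estimates, which gives some intuition for why their errors are linked in this setting.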
We develop and analyze M-estimation methods for divergence functionals and the likelihood ratios of two probability distributions. Our method is based on a non-asymptotic variational characterization of f-divergences, which allows the problem of estimating divergences to be tackled via convex empirical risk optimization. The resulting estimators are simple to implement, requiring only the solution of standard convex programs. We present an analysis of consistency and convergence for these estimators. Given conditions only on the ratios of densities, we show that our estimators can achieve optimal minimax rates for the likelihood ratio and the divergence functionals in certain regimes. We derive an efficient optimization algorithm for computing our estimates, and illustrate their convergence behavior and practical viability by simulations.
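A small numerical check may help illustrate what a variational characterization of a divergence looks like. The sketch below uses the Kullback-Leibler divergence, for which $\mathrm{KL}(P \,\|\, Q) = \sup_{g > 0} \{ \mathbb{E}_P[\log g] - \mathbb{E}_Q[g] + 1 \}$ with the supremum attained at the likelihood ratio $g = dP/dQ$. The discrete distributions and function names are assumptions made for illustration, and the paper's own notation and argument order may differ; an estimator in this spirit would replace the expectations by sample averages and maximize over a function class.

```python
import math


def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions with full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def variational_objective(g, p, q):
    """E_P[log g] - E_Q[g] + 1, a lower bound on KL(P || Q) for any g > 0."""
    expected_log_g = sum(pi * math.log(gi) for pi, gi in zip(p, g))
    expected_g = sum(qi * gi for qi, gi in zip(q, g))
    return expected_log_g - expected_g + 1.0


# Hypothetical discrete distributions on three points.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

kl = kl_divergence(p, q)
ratio = [pi / qi for pi, qi in zip(p, q)]
at_ratio = variational_objective(ratio, p, q)           # attains the supremum
at_ones = variational_objective([1.0, 1.0, 1.0], p, q)  # a suboptimal choice of g
```

Here `at_ratio` matches `kl` up to rounding, while `at_ones` is strictly smaller. The objective is concave in `g`, which is why the resulting estimation problem can be posed as a convex program.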